Multiple kernels learning-based biological entity relationship extraction method

نویسندگان

  • Dongliang Xu
  • Jingchang Pan
  • Bailing Wang
چکیده

BACKGROUND Automatic extracting protein entity interaction information from biomedical literature can help to build protein relation network and design new drugs. There are more than 20 million literature abstracts included in MEDLINE, which is the most authoritative textual database in the field of biomedicine, and follow an exponential growth over time. This frantic expansion of the biomedical literature can often be difficult to absorb or manually analyze. Thus efficient and automated search engines are necessary to efficiently explore the biomedical literature using text mining techniques. RESULTS The P, R, and F value of tag graph method in Aimed corpus are 50.82, 69.76, and 58.61%, respectively. The P, R, and F value of tag graph kernel method in other four evaluation corpuses are 2-5% higher than that of all-paths graph kernel. And The P, R and F value of feature kernel and tag graph kernel fuse methods is 53.43, 71.62 and 61.30%, respectively. The P, R and F value of feature kernel and tag graph kernel fuse methods is 55.47, 70.29 and 60.37%, respectively. It indicated that the performance of the two kinds of kernel fusion methods is better than that of simple kernel. CONCLUSION In comparison with the all-paths graph kernel method, the tag graph kernel method is superior in terms of overall performance. Experiments show that the performance of the multi-kernels method is better than that of the three separate single-kernel method and the dual-mutually fused kernel method used hereof in five corpus sets.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Online Learning with (Multiple) Kernels: A Review

This review examines kernel methods for online learning, in particular, multiclass classification. We examine margin-based approaches, stemming from Rosenblatt's original perceptron algorithm, as well as nonparametric probabilistic approaches that are based on the popular gaussian process framework. We also examine approaches to online learning that use combinations of kernels--online multiple ...

متن کامل

Neural Network-Based Learning Kernel for Automatic Segmentation of Multiple Sclerosis Lesions on Magnetic Resonance Images

Background: Multiple Sclerosis (MS) is a degenerative disease of central nervous system. MS patients have some dead tissues in their brains called MS lesions. MRI is an imaging technique sensitive to soft tissues such as brain that shows MS lesions as hyper-intense or hypo-intense signals. Since manual segmentation of these lesions is a laborious and time consuming task, automatic segmentation ...

متن کامل

Discovering Relations Between Named Entities from a Large Raw Corpus Using Tree Similarity-Based Clustering

We propose a tree-similarity-based unsupervised learning method to extract relations between Named Entities from a large raw corpus. Our method regards relation extraction as a clustering problem on shallow parse trees. First, we modify previous tree kernels on relation extraction to estimate the similarity between parse trees more efficiently. Then, the similarity between parse trees is used i...

متن کامل

Exploiting Tree Kernels for High Performance Chemical Induced Disease Relation Extraction

Machine learning approaches based on supervised classification have emerged as effective methods for Biomedical relation extraction such as the Chemical-InducedDisease (CID) task. These approaches owe their success to a rich set of features crafted from the lexical and syntactic regularities in the text. Kernel methods are an effective alternative to manual feature engineering and have been suc...

متن کامل

Subsequence Kernels for Relation Extraction

We present a new kernel method for extracting semantic relations between entities in natural language text, based on a generalization of subsequence kernels. This kernel uses three types of subsequence patterns that are typically employed in natural language to assert relationships between two entities. Experiments on extracting protein interactions from biomedical corpora and top-level relatio...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره 8  شماره 

صفحات  -

تاریخ انتشار 2017